{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Anachronism example\n",
    "\n",
    "This example takes a <a href=\"https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks/anachronisms\">simple task from BigBench</a>, where the goal is to identify whether a given sentence contains an anachronism (i.e. states something that is impossible because of the time periods involved)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loaded 46 data items\n"
     ]
    }
   ],
   "source": [
    "import datasets\n",
    "\n",
    "# load the data\n",
    "data = datasets.load_dataset('bigbench', 'anachronisms')\n",
    "inputs = [x.split('\\n')[0] for x in data['validation']['inputs']]\n",
    "labels = [x[0] for x in data['validation']['targets']]\n",
    "\n",
    "print(f\"Loaded {len(labels)} data items\")"
   ]
  },
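  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the two list comprehensions above concrete, here is a minimal sketch of what they do to a single record. The record is made up for illustration; the only assumptions are the ones the loading code itself makes, namely that the sentence is the first line of `inputs` and that `targets` is a list whose first element is the label.\n",
    "\n",
    "```python\n",
    "# Hypothetical record, invented for illustration; real dataset entries may differ.\n",
    "record = {\n",
    "    'inputs': 'The T-Rex bit my dog\\nIs this an anachronism?',\n",
    "    'targets': ['Yes'],\n",
    "}\n",
    "\n",
    "# keep only the first line of the input text\n",
    "sentence = record['inputs'].split('\\n')[0]\n",
    "# take the first (and here only) target as the label\n",
    "label = record['targets'][0]\n",
    "\n",
    "print(sentence)\n",
    "print(label)\n",
    "```"
   ]
  },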
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let us load a model into `guidance`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Attempting to load /mnt/c/Users/riedgar/Downloads/llama-2-7b.Q5_K_M.gguf\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "import guidance\n",
    "\n",
    "# define the model we will use\n",
    "# MODEL_PATH should point at the gguf file which you wish to use\n",
    "target_model_path = os.getenv(\"MODEL_PATH\")\n",
    "print(f\"Attempting to load {target_model_path}\")\n",
    "\n",
    "lm = guidance.models.LlamaCpp(target_model_path, n_gpu_layers=-1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now define an `anachronism_query()` function.\n",
    "The function is decorated with `@guidance` and contains `guidance` instructions.\n",
    "The first argument to a function decorated like this is always a language model, and the function returns the same model after appending whatever strings and `guidance` instructions are required.\n",
    "\n",
    "In this case, we take some few-shot examples in addition to the desired query, and build them into a prompt.\n",
    "We then provide `guidance` commands to step through some chain-of-thought (CoT) reasoning.\n",
    "Notice how we use the `stop` keyword to end each generation before the next stage in the CoT begins; otherwise, the model may go off the rails and generate more than we want in the first 'entities' generation call.\n",
    "In the final step, we use `guidance.select` to force the model to generate a 'Yes' or 'No' answer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "@guidance\n",
    "def anachronism_query(llm, query, examples):\n",
    "    prompt_string = \"\"\"Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or\n",
    "not based on the time periods associated with the entities).\n",
    "\n",
    "Here are some examples:\n",
    "\"\"\"\n",
    "    for ex in examples:\n",
    "        prompt_string += f\"Sentence: { ex['input'] }\" + \"\\n\"\n",
    "        prompt_string += \"Entities and dates:\\n\"\n",
    "        for en in ex['entities']:\n",
    "            prompt_string += f\"{en['entity']} : {en['time']}\" + \"\\n\"\n",
    "        prompt_string += f\"Reasoning: {ex['reasoning']}\" + \"\\n\"\n",
    "        prompt_string += f\"Anachronism: {ex['answer']}\" + \"\\n\"\n",
    "\n",
    "    llm += f'''{prompt_string}\n",
    "Now determine whether the following is an anachronism:\n",
    "    \n",
    "Sentence: { query }\n",
    "Entities and dates:\n",
    "{ guidance.gen(name=\"entities\", max_tokens=100, stop=\"Reason\") }'''\n",
    "    llm += \"Reasoning :\"\n",
    "    llm += guidance.gen(name=\"reason\", max_tokens=100, stop=\"\\n\")\n",
    "    llm += f'''\\nAnachronism: { guidance.select([\"Yes\", \"No\"], name=\"answer\") }'''\n",
    "    return llm"
   ]
  },
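  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The few-shot part of the prompt is plain string building, so it can be inspected without loading a model. The sketch below uses a hypothetical helper, `build_fewshot_prompt()`, which is not part of `guidance`; it only mirrors the loop inside `anachronism_query()` so that the resulting text can be printed.\n",
    "\n",
    "```python\n",
    "# Hypothetical helper, not part of guidance: mirrors the loop in anachronism_query()\n",
    "def build_fewshot_prompt(examples):\n",
    "    lines = []\n",
    "    for ex in examples:\n",
    "        lines.append('Sentence: ' + ex['input'])\n",
    "        lines.append('Entities and dates:')\n",
    "        for en in ex['entities']:\n",
    "            lines.append(en['entity'] + ' : ' + en['time'])\n",
    "        lines.append('Reasoning: ' + ex['reasoning'])\n",
    "        lines.append('Anachronism: ' + ex['answer'])\n",
    "    # one line per entry, ending with a newline as in the original loop\n",
    "    return '\\n'.join(lines) + '\\n'\n",
    "\n",
    "example = {\n",
    "    'input': 'I wrote about shakespeare',\n",
    "    'entities': [{'entity': 'I', 'time': 'present'},\n",
    "                 {'entity': 'Shakespeare', 'time': '16th century'}],\n",
    "    'reasoning': 'I can write about Shakespeare because he lived in the past with respect to me.',\n",
    "    'answer': 'No',\n",
    "}\n",
    "\n",
    "prompt = build_fewshot_prompt([example])\n",
    "print(prompt)\n",
    "```\n",
    "\n",
    "Printing the text this way is a quick check that every example follows the same `Sentence` / `Entities and dates` / `Reasoning` / `Anachronism` layout that the model is later asked to complete."
   ]
  },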
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now invoke our function with a query string and some examples.\n",
    "Note that when we call `anachronism_query()` we _don't_ pass in the language model itself; the `@guidance` decorator takes care of that when the result is added to `lm`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style='margin: 0px; padding: 0px; padding-left: 8px; margin-left: -8px; border-radius: 0px; border-left: 1px solid rgba(127, 127, 127, 0.2); white-space: pre-wrap; font-family: ColfaxAI, Arial; font-size: 15px; line-height: 23px;'>Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or\n",
       "not based on the time periods associated with the entities).\n",
       "\n",
       "Here are some examples:\n",
       "Sentence: I wrote about shakespeare\n",
       "Entities and dates:\n",
       "I : present\n",
       "Shakespeare : 16th century\n",
       "Reasoning: I can write about Shakespeare because he lived in the past with respect to me.\n",
       "Anachronism: No\n",
       "Sentence: Shakespeare wrote about me\n",
       "Entities and dates:\n",
       "Shakespeare : 16th century\n",
       "I : present\n",
       "Reasoning: Shakespeare cannot have written about me, because he died before I was born\n",
       "Anachronism: Yes\n",
       "\n",
       "Now determine whether the following is an anachronism:\n",
       "\n",
       "Sentence: The T-Rex bit my dog\n",
       "Entities and dates:<span style='background-color: rgba(0.0, 165.0, 0, 0.15); border-radius: 3px;' title='1.0'>\n",
       "</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>T</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>-</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>Rex</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> :</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> </span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>6</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>5</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> million</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> years</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> ago</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>\n",
       "</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>D</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>og</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> :</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> present</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>\n",
       "</span>Reasoning :<span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> The</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> T</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>-</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>R</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>ex</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> is</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> ext</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>inct</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>,</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> so</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> it</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> cannot</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> b</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>ite</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> my</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> dog</span><span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'>.</span>\n",
       "Anachronism:<span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> Yes</span></pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "entities:\n",
      "T-Rex : 65 million years ago\n",
      "Dog : present\n",
      "\n",
      "reasoning:  The T-Rex is extinct, so it cannot bite my dog.\n",
      "answer: Yes\n"
     ]
    }
   ],
   "source": [
    "# define the few shot examples\n",
    "fewshot_examples = [\n",
    "    {'input': 'I wrote about shakespeare',\n",
    "    'entities': [{'entity': 'I', 'time': 'present'}, {'entity': 'Shakespeare', 'time': '16th century'}],\n",
    "    'reasoning': 'I can write about Shakespeare because he lived in the past with respect to me.',\n",
    "    'answer': 'No'},\n",
    "    {'input': 'Shakespeare wrote about me',\n",
    "    'entities': [{'entity': 'Shakespeare', 'time': '16th century'}, {'entity': 'I', 'time': 'present'}],\n",
    "    'reasoning': 'Shakespeare cannot have written about me, because he died before I was born',\n",
    "    'answer': 'Yes'}\n",
    "]\n",
    "\n",
    "# Invoke the model\n",
    "generate = lm + anachronism_query(\"The T-Rex bit my dog\", fewshot_examples)\n",
    "\n",
    "# Show the extracted generations\n",
    "print(\"entities:\\n{0}\".format(generate['entities']))\n",
    "print(f\"reasoning: {generate['reason']}\")\n",
    "print(f\"answer: {generate['answer']}\")"
   ]
  },
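  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The calling convention used above (the function takes the model as its first argument, but the caller omits it and writes `lm + f(...)`) can be illustrated with a toy in plain Python. This is **not** how `guidance` is implemented; `ToyModel`, `toy_guidance` and `greet` are hypothetical names invented for this sketch, which only shows one way such a deferred call could work.\n",
    "\n",
    "```python\n",
    "class ToyModel:\n",
    "    # Toy stand-in for a language model: it just accumulates text.\n",
    "    def __init__(self, text=''):\n",
    "        self.text = text\n",
    "\n",
    "    def __add__(self, other):\n",
    "        # a deferred call receives the current model as its first argument\n",
    "        if callable(other):\n",
    "            return other(self)\n",
    "        # anything else is appended as text to a new model object\n",
    "        return ToyModel(self.text + str(other))\n",
    "\n",
    "\n",
    "def toy_guidance(fn):\n",
    "    # Toy decorator: calling the function only records its arguments.\n",
    "    def wrapper(*args, **kwargs):\n",
    "        return lambda model: fn(model, *args, **kwargs)\n",
    "    return wrapper\n",
    "\n",
    "\n",
    "@toy_guidance\n",
    "def greet(lm, name):\n",
    "    lm += 'Hello, ' + name\n",
    "    return lm\n",
    "\n",
    "\n",
    "# the model is supplied by the addition, not by the caller\n",
    "result = ToyModel('> ') + greet('world')\n",
    "print(result.text)\n",
    "```"
   ]
  },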
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For comparison purposes, we can also define a zero-shot function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style='margin: 0px; padding: 0px; padding-left: 8px; margin-left: -8px; border-radius: 0px; border-left: 1px solid rgba(127, 127, 127, 0.2); white-space: pre-wrap; font-family: ColfaxAI, Arial; font-size: 15px; line-height: 23px;'>Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or\n",
       "not based on the time periods associated with the entities).\n",
       "\n",
       "Sentence: The T-Rex bit my dog\n",
       "Anachronism:<span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> No</span>\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "answer: No\n"
     ]
    }
   ],
   "source": [
    "@guidance\n",
    "def anachronism_query_zeroshot(llm, query):\n",
    "    llm += f'''Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or\n",
    "not based on the time periods associated with the entities).\n",
    "\n",
    "Sentence: {query}\n",
    "Anachronism: { guidance.select([\"Yes\", \"No\"], name=\"answer\") }\n",
    "'''\n",
    "    return llm\n",
    "\n",
    "generate_zero = lm + anachronism_query_zeroshot(\"The T-Rex bit my dog\")\n",
    "\n",
    "# Show the extracted generations\n",
    "print(f\"answer: {generate_zero['answer']}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compute accuracy\n",
    "\n",
    "We compute accuracy on the validation set for the structured few-shot prompt, and compare it with the zero-shot prompt defined above. The best reported results for this task are listed <a href=\"https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks/anachronisms\">here</a>. We hope that a simple output structure will improve the accuracy of the results:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style='margin: 0px; padding: 0px; padding-left: 8px; margin-left: -8px; border-radius: 0px; border-left: 1px solid rgba(127, 127, 127, 0.2); white-space: pre-wrap; font-family: ColfaxAI, Arial; font-size: 15px; line-height: 23px;'>Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or\n",
       "not based on the time periods associated with the entities).\n",
       "\n",
       "Sentence: Vasco de Gama avoided shipwreck by the Cape of Good Hope thanks to his astrolabe.\n",
       "Anachronism:<span style='background-color: rgba(165.0, 0.0, 0, 0.15); border-radius: 3px;' title='0.0'> Yes</span>\n",
       "</pre>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "fews = []\n",
    "zero_shot = []\n",
    "# `sentence` avoids shadowing the built-in `input`\n",
    "for count, sentence in enumerate(inputs):\n",
    "    print(f\"Working on item {count}\")\n",
    "    # few-shot prediction with the chain-of-thought structure\n",
    "    f = lm + anachronism_query(sentence, fewshot_examples)\n",
    "    f = 'Yes' if 'Yes' in f['answer'] else 'No'\n",
    "    fews.append(f)\n",
    "    # zero-shot prediction\n",
    "    g = lm + anachronism_query_zeroshot(sentence)\n",
    "    g = 'Yes' if 'Yes' in g['answer'] else 'No'\n",
    "    zero_shot.append(g)\n",
    "fews = np.array(fews)\n",
    "zero_shot = np.array(zero_shot)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we can compute the accuracy for each of the approaches:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Few-shot 0.41304347826086957\n",
      "Zero-shot 0.41304347826086957\n"
     ]
    }
   ],
   "source": [
    "print('Few-shot', (np.array(labels) == fews).mean())\n",
    "print('Zero-shot', (np.array(labels) == zero_shot).mean())"
   ]
  },
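  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy printed above is simply the fraction of predictions that match their labels. As a minimal sketch, the same quantity can be computed in plain Python; the `accuracy()` helper and the toy lists below are made up for illustration and are not results from the model.\n",
    "\n",
    "```python\n",
    "def accuracy(labels, predictions):\n",
    "    # fraction of positions where the prediction equals the label\n",
    "    matches = [label == pred for label, pred in zip(labels, predictions)]\n",
    "    return sum(matches) / len(matches)\n",
    "\n",
    "# toy data, invented for illustration only\n",
    "toy_labels = ['Yes', 'No', 'Yes', 'No']\n",
    "toy_preds = ['Yes', 'Yes', 'Yes', 'No']\n",
    "\n",
    "print(accuracy(toy_labels, toy_preds))  # 0.75 (3 of 4 match)\n",
    "```"
   ]
  },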
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<hr style=\"height: 1px; opacity: 0.5; border: none; background: #cccccc;\">\n",
    "<div style=\"text-align: center; opacity: 0.5\">Have an idea for more helpful examples? Pull requests that add to this documentation notebook are encouraged!</div>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
